58 research outputs found
Statistical mechanics of RNA folding: importance of alphabet size
We construct a minimalist model of RNA secondary-structure formation and use
it to study the mapping from sequence to structure. There are strong,
qualitative differences between two-letter and four or six-letter alphabets.
With only two kinds of bases, there are many alternate folding configurations,
yielding thermodynamically stable ground-states only for a small set of
structures of high designability, i.e., total number of associated sequences.
In contrast, sequences made from four bases, as found in nature, or six bases
have far fewer competing folding configurations, resulting in a much greater
average stability of the ground state. Comment: 7 figures; uses revtex
Louse (Insecta : Phthiraptera) mitochondrial 12S rRNA secondary structure is highly variable
Lice are ectoparasitic insects hosted by birds and mammals. Mitochondrial 12S rRNA sequences obtained from lice show considerable length variation and are very difficult to align. We show that the louse 12S rRNA domain III secondary structure displays considerable variation compared with that of other insects, in both the shape and number of stems and loops. Phylogenetic trees constructed from tree edit distances between louse 12S rRNA structures do not closely resemble trees constructed from sequence data, suggesting that at least some of this structural variation has arisen independently in different louse lineages. Taken together with previous work on mitochondrial gene order and elevated rates of substitution in louse mitochondrial sequences, the structural variation in louse 12S rRNA confirms the highly distinctive nature of molecular evolution in these insects.
Control of Cognate Sense mRNA Translation by cis-Natural Antisense RNAs.
Cis-Natural Antisense Transcripts (cis-NATs), which overlap protein coding genes and are transcribed from the opposite DNA strand, constitute an important group of noncoding RNAs. Whereas several examples of cis-NATs regulating the expression of their cognate sense gene are known, most cis-NATs function by altering the steady-state level or structure of mRNA via changes in transcription, mRNA stability, or splicing, and very few cases involve the regulation of sense mRNA translation. This study was designed to systematically search for cis-NATs influencing cognate sense mRNA translation in Arabidopsis (Arabidopsis thaliana). Establishment of a pipeline relying on sequencing of total polyA+ and polysomal RNA from Arabidopsis grown under various conditions (i.e. nutrient deprivation and phytohormone treatments) allowed the identification of 14 cis-NATs whose expression correlated either positively or negatively with cognate sense mRNA translation. With use of a combination of cis-NAT stable over-expression in transgenic plants and transient expression in protoplasts, the impact of cis-NAT expression on mRNA translation was confirmed for 4 out of 5 tested cis-NAT:sense mRNA pairs. These results expand the number of cis-NATs known to regulate cognate sense mRNA translation and provide a foundation for future studies of their mode of action. Moreover, this study highlights the role of this class of noncoding RNAs in translation regulation.
Simultaneous alignment and folding of protein sequences
Accurate comparative analysis of low-homology proteins remains a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm's complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Testing against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise sequence alignment tools in the most difficult low sequence homology case and improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families. partiFold-Align is available at http://partiFold.csail.mit.edu
RFMirTarget: A Random Forest Classifier for Human miRNA Target Gene Prediction
Abstract. MicroRNAs (miRNAs) are key regulators of eukaryotic gene expression whose fundamental role has already been identified in many cell pathways. The correct identification of miRNA targets is a major challenge in bioinformatics. So far, machine learning-based methods for miRNA-target prediction have shown the best results in terms of specificity and sensitivity. However, despite its well-known efficiency in other classification tasks, the random forest algorithm has not been employed in this problem. Therefore, in this work we present RFMirTarget, an efficient random forest miRNA-target prediction system. Our tool analyzes the alignment between a candidate miRNA-target pair and extracts a set of structural, thermodynamic, alignment and position-based features. Experiments have shown that RFMirTarget achieves a Matthews correlation coefficient nearly 48% greater than the performance reported for MultiMiTar, which was trained upon the same data set. In addition, tests performed with RFMirTarget reinforce the importance of the seed region for target prediction accuracy.
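For readers unfamiliar with the metric quoted in this abstract, the Matthews correlation coefficient can be computed from the four confusion-matrix counts of a binary classifier. The sketch below is a generic illustration, not code from RFMirTarget; the function name and the choice to return 0.0 when the denominator vanishes are assumptions made here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Returns a value in [-1, 1].
    Assumption: when any marginal sum is zero the coefficient is
    undefined, and this sketch returns 0.0 in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

A perfect classifier scores 1, a perfectly inverted one scores -1, and a classifier whose predictions are uncorrelated with the labels scores 0.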
A Combinatorial Framework for Designing (Pseudoknotted) RNA Algorithms
We extend a hypergraph representation, introduced by Finkelstein and
Roytberg, to unify dynamic programming algorithms in the context of RNA folding
with pseudoknots. Classic applications of RNA dynamic programming (energy
minimization, partition function, base-pair probabilities...) are reformulated
within this framework, giving rise to very simple algorithms. This
reformulation allows one to conceptually detach the conformation space/energy
model -- captured by the hypergraph model -- from the specific application,
assuming unambiguity of the decomposition. To ensure the latter property, we
propose a new combinatorial methodology based on generating functions. We
extend the set of generic applications by proposing an exact algorithm for
extracting generalized moments in weighted distribution, generalizing a prior
contribution by Miklos et al. Finally, we illustrate our full-fledged
programme on three exemplary conformation spaces (secondary structures,
Akutsu's simple type pseudoknots and kissing hairpins). This readily gives sets
of algorithms that are either novel or have complexity comparable to classic
implementations for minimization and Boltzmann ensemble applications of dynamic
programming.
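The idea of detaching the conformation space from the application can be illustrated with a much simpler device than the paper's hypergraph formalism: one unambiguous recursion over pseudoknot-free secondary structures, evaluated under different "plus" and "times" operations. The sketch below is an assumption-laden illustration, not the authors' implementation; the function `fold`, the pairing set `PAIRS`, the uniform `pair_weight`, and the minimum loop size of 3 are all hypothetical choices made here.

```python
from functools import lru_cache

# Assumed pairing rules: Watson-Crick plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def fold(seq, plus, times, one, pair_weight, min_loop=3):
    """Evaluate one unambiguous decomposition under a chosen algebra.

    Each structure on seq[i:j] is generated exactly once: position i is
    either unpaired, or paired with some k, which splits the interval
    into the enclosed part seq[i+1:k] and the remainder seq[k+1:j].
    """
    @lru_cache(maxsize=None)
    def f(i, j):
        if j - i <= 0:
            return one  # empty interval: the single empty structure
        total = f(i + 1, j)  # case 1: position i is unpaired
        for k in range(i + min_loop + 1, j):  # case 2: i pairs with k
            if (seq[i], seq[k]) in PAIRS:
                term = times(times(pair_weight, f(i + 1, k)), f(k + 1, j))
                total = plus(total, term)
        return total

    return f(0, len(seq))


def count_structures(seq):
    # (+, *) with weight 1 counts the structures.
    return fold(seq, lambda a, b: a + b, lambda a, b: a * b, 1, 1)


def max_pairs(seq):
    # (max, +) with weight 1 maximizes the number of base pairs.
    return fold(seq, max, lambda a, b: a + b, 0, 1)


def partition_function(seq, boltzmann_weight):
    # (+, *) with a per-pair Boltzmann factor gives a toy partition function.
    return fold(seq, lambda a, b: a + b, lambda a, b: a * b, 1.0, boltzmann_weight)
```

Because the decomposition generates each structure exactly once, the same recursion yields a count, an optimum, or a weighted sum simply by swapping the algebra, which is why unambiguity matters for the counting and partition-function applications.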
Detecting the Dependent Evolution of Biosequences
A probabilistic graphical model is developed in order to detect the dependent evolution between different sites in biological sequences. Given a multiple sequence alignment for each molecule of interest and a phylogenetic tree, the model can predict potential interactions within or between nucleic acids and proteins. Initial validation of the model is carried out using tRNA sequence data. The model is able to accurately identify the secondary structure of tRNA as well as several known tertiary interactions.
RNA secondary structure analysis using the RNAshapes package
Reeder J, Giegerich R. RNA secondary structure analysis using the RNAshapes package. Current Protocols in Bioinformatics. 2009;26(1):Unit12.8